Detecting transient gravitational waves in non- Gaussian noise with partially redundant analysis 

methods 
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There is a broad class of astrophysical sources that produce detectable, transient, gravitational waves. Some 
searches for transient gravitational waves are tailored to known features of these sources. Other searches make 
few assumptions about the sources. Typically events are observable with multiple search techniques. This work 
describes how to combine the results of searches that are not independent, treating each search as a classifier for 
a given event. This will be shown to improve the overall sensitivity to gravitational-wave events while directly 
addressing the problem of consistent interpretation of multiple trials. 



I. INTRODUCTION 



A variety of astrophysical sources are capable of producing 
transient gravitational-wave signals with sufficient strength to 
be detectable by ground-based gravitational-wave detectors 
such as Laser Interferometer Gravitational-wave Observatory 
(LIGO) IH]. Such systems include coalescing compact bina- 
ries (CBC) consisting of neutron stars and/or black holes 
Currently, the same experimental data sets are searched by sev- 
eral analysis methods. These methods use various signal mod- 
els and data processing algorithms and may have different re- 
sponses to signals and non-Gaussian noise artifacts f^. Since 
these searches are not independent, a single, more powerful 
result can be obtained by combining the results of multiple 
search methods. 

The analysis methods can be divided into two classes by 
their assumptions about the signal properties. The first class 
assumes that the waveforms are well modeled and typically 
employs matched-filtering. For this reason, these methods is 
referred to as template-based searches. The second class as- 
sumes only basic time frequency properties about the signals. 
These methods are referred to as un-modeled searches. In 
the template-based searches, there are often several ways to 
construct signal models. This means that if a detectable sig- 
nal exists in the data, it may not perfectly match the signal 
model chosen for the analysis. For example, the templates 
may be constructed using different approximation techniques, 
or they may correspond to different parts of the gravitational- 
wave signal (e.g. inspiral or ringdown stages of the compact 
binary coalescence). The un-modeled searches make mini- 
mal assumptions about the shape of the signal and are de- 
signed to detect any short outburst of gravitational radiation 
in a given frequency band. Both search classes employ algo- 
rithms for identifying and discarding the non-Gaussian noise 
artifacts. To their advantage, un-modeled searches are able to 



detect a wide class of signals. However, the template-based 
searches generally achieve higher detection efficiency for sig- 
nals matching the templates. 

A gravitational-wave search produces a list of candidate 
events. Re-analyzing the data with multiple methods may 
increase the odds of detecting a gravitational wave. At the 
same time it has the negative effect of generating redundant 
lists of gravitational-wave candidates and increasing the num- 
ber of trials, which makes it more difficult to assess the sig- 
nificance of an event. Sensitivity domains of many searches 
overlap, meaning that multiple searches may detect the same 
gravitational-wave signal. The detection efficiency of a given 
search depends on a variety of factors and it can be difficult to 
interpret results of multiple searches. 

It is apparent that gravitational-wave searches would ben- 
efit from a procedure to consistently combine results into a 
joint detection or model exclusion statement for a given pop- 
ulation of gravitational-wave sources. We apply the general 
framework for detection of gravitational waves in the pres- 
ence of non-Gaussian noise developed in our earlier paper 101 
to this problem. Treating the output of each search method as 
a classifier for gravitational-wave candidate events, we con- 
struct a unified ranking for all candidate events that is easy to 
implement and interpret. We test it by combining candidate 
events from four different search methods that analyze simu- 
lated gravitational-wave signals from compact binary coales- 
cence embedded in the data taken during LIGO's S4 science 
run. We find that this procedure is robust and can be used in 
ongoing and future searches. Interpretation of the combined 
results is straightforward. In particular, the calculation of the 
posterior probability distribution or upper limit for the rate of 
coalescing binaries can be carried out as it is normally per- 
formed for a single search. The combined upper limit calcula- 
tion was addressed in [5], assuming multiple search methods 
were performed. We briefly discuss the relationship between 
the method suggested in that paper and ours. 
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The paper is organized as follows. In Section|II] we formu- 
late the problem of combining results from multiple searches 
and construct a statistic for the joint analysis. We conclude 
this section with a discussion of a rate upper limit calculation 
for the combined search and its relation to the method sug- 
gested in list). In Section Hn] we test our procedure by com- 
bining results from four different search methods. We briefly 
describe each method, the data, and the model signals. This is 
followed by details of the simulations and a discussion of the 
results. In AppendiceslAlandlBl we derive a formal expression 
for the multivariate statistic, which accounts for correlations 
between the searches, and analyze the limits of applicability 
of our procedure. 



II. METHOD FOR COMBINING SEARCHES 

In this section, we establish a method for combining results 
from different gravitational-wave searches performed over the 
same data. The method builds on the general approach de- 
scribed in |4|]. We construct a unified statistic for searches by 
treating each as a separate, possibly redundant, classifier for a 
given event. 

Each search method aims to classify observational data into 
a list of candidate events, ranked by their likelihood to be a 
gravitational-wave signal. In the data analysis process, the 
data are analyzed and assigned a rank, r, a real number reflect- 
ing the odds that the data contain a gravitational-wave signal. 
Ordering time series data by amplitude is one simple method 
for ranking candidate events. The rank (or amplitude) is com- 
pared to a pre-established threshold, a boundary that separates 
signal-like data with sufficient confidence. In this way, the pro- 
cedure classifies data on a scale from not signal-like to signal- 
like. A search method may classify events by complicated 
consistency tests and noise rejection schemes, but conceptu- 
ally any search can be thought of as a mapping from the space 
of data to the space of real numbers that indicate their rank. 
We will assume that such a ranking procedure exists for any 
search method, i, and that the result, r^, indicates the likeli- 
hood that a signal is present in a given search. 

Different search methods employ a variety of techniques, 
data processing algorithms and waveform models. As there 
are a number of potential gravitational-wave sources, the 
search targets may vary as well. Separate searches may pro- 
vide different information about a particular population of 
sources. Hence, it is important to extract as much informa- 
tion as possible by combining the results of various searches. 
When multiple searches analyze the same data, the output of 
each search, r^, can be further processed to make the most in- 
formative detection or rate limit statement for a population of 
gravitational-wave sources. In doing so, it is important to en- 
sure that there is no loss of detection efficiency when one or 
more of the methods has a high false alarm rate or is uninfor- 
mative or irrelevant for the targeted source population. 

For a given event with rank, r,, one can compute the poste- 
rior probability that it is a gravitational-wave signal, p{l | r,). 
Following the steps outlined in this probability can be ex- 



pressed as 



P(l I r,) = 



A(rO 



A(r,)+p(0)Ml) 



where the likelihood ratio, A(ri), is defined by 
/p(r, |h,l)p(h|l)dh 



A(r,) = 



P(r. I 0) 



(1) 



(2) 



and p{ri | h, 1) is the probability of observing in the pres- 
ence of the signal h, p{h 1 1) is the prior probability to receive 
that signal, and p(ri | 0) is the probability of observing in 
the absence of any signal. The targeted astrophysical popula- 
tion of sources is completely described by p{h \ 1), where h 
denotes all possible intrinsic (e.g. masses of compact objects 
in the binary) and extrinsic (e.g. distance to the source, sky 
location) source parameters. 

If an event is identified as a plausible candidate by several 
search methods, p(l | r^) can be calculated for each search 
based on the ranking, ij, the event received. Thus, information 
from each search can be directly compared. The most relevant 
search results in the highest posterior probability for a signal 
to be present in the data. According to Eq. ([T]i, this probability 
is a monotonically increasing function of the likelihood ratio, 
A(ri). As such, comparing likelihood ratios is equivalent to 
comparing the posterior probabilities, p{l | r^). Therefore, the 
likelihood ratio can be used as a unified ranking statistic to 
combine the output of all searches. 

Strictly speaking, the denominator in Eq. ^ should contain 
contributions from all gravitational-wave sources not included 
in the targeted population, p(h | 1). We neglect these terms 
because typically their contribution is very small. If a popula- 
tion of sources induces a response very similar to that of the 
targeted sources, then these classifiers will not distinguish sig- 
nals from the two different populations. This can lead to an 
overestimation of event rates for the targeted population, how- 
ever no signal would be missed. Further refinement of the data 
analysis techniques or detectors themselves would be required 
to distinguish between the signals from these sources. 

We define the ranking statistic for the joint search to be 



l"joint 



;{A(ri),A(r2),...,A(r„)} 



(3) 



where maximization is carried out over simultaneous events. 
Though this choice does not make use of all available infor- 
mation (we neglect correlations between the classifier's ranks; 
see Appendix |A] for multivariate treatment of the problem), it 
does offer some advantages. It is straightforward to compute 
A(ri) for each search method and simply take the largest. This 
has a simple interpretation: events from each search are com- 
pared based on the ratio of sensitivity of the method to the 
targeted sources and the search's false alarm rate. The event 
that is most likely to be a gravitational wave is kept. As a 
result, the searches are combined according to the best clas- 
sifier for each event. Events classified by noisy, insensitive 
searches receive a low likelihood-ratio ranking and therefore 
do not contaminate the overall sensitivity of the analysis. We 
further discuss the limits of applicability for this ranking statis- 
tic in AppendixlB] 
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As in the case of a single search method, the result of com- 
bining searches using the maximum likelihood-ratio statistic, 
Eq. ©, is a list of events. They can be treated as the output of 
a single search, with their significance evaluated by estimat- 
ing the background. In the next section, we discuss how to do 
this. Having a single list of events allows for straightforward 
interpretation of results. The most significant events can be 
further studied and possibly promoted to the list of detected 
gravitational-wave signals. The posterior probability distribu- 
tion or the upper limit on the rate of coalescence can be cal- 
culated following any of the methods developed for a single 
search 

The upper limit calculation for multiple searches described 
in is! differs from our method. In [5], searches are treated 
as counting experiments and the upper limit on the rate of 
events is calculated using the total number of events above 
some fixed threshold and Poisson statistics. To apply this 
method when multiple searches are performed, a prescrip- 
tion is needed for determining how manyevents each search 
should contribute to the total count. In events are classi- 
fied by combinations of searches that generated them. The 
problem is reduced to the choice of foliation by a family of 
exclusion surfaces, S{C), of the space of the number of events 
in each category, Nq, where q runs through all possible com- 
binations of m (m < n) out of n searches. In the paper, the 
authors suggest and discuss several plausible choices of linear 
surfaces, 5(C), that lead to different upper limits. By construc- 
tion, the maximum likelihood ratio, Eq. (|3]l, ensures that the 
total number of events each search contributes on average to 
the joint search is proportional to the ratio of its efficiency to 
detect the targeted signals to its background. This construc- 
tion is closely related to the "single combination" option of 
Jit], in which only the most sensitive search contributes to the 
upper limit. Note though, that in our method the most sensi- 
tive search is determined during the analysis on event by event 
basis. This relieves an analyst from determining before hand 
which of the searching methods is the most sensitive. Often 
sensitivity is a very complicated function of signal's param- 
eters and there might not be a single most sensitive search 
method that covers all signals. In this case, one would have 
to split the signal parameter space into subdomains, within 
which a single most sensitive search method exist, and carry 
out the upper limit calculation for each of the domains inde- 
pendently. In practice, this may prove to be a formidable task. 
The maximum-likelihood ratio procedure is universal and is 
almost trivial to implement, as we show it the next section. 
Its other important advantage is accounting for background 
noise present in each of the search methods. The choice be- 
tween the methods is based not only on their sensitivity but 
also their susceptibility to the noise artifacts. In Jsl, the au- 
thors also mention the necessity to include information about 
the background to achieve more optimal upper limits. 

The maximum likelihood construction ignores correlation 
between the searches. The optimal way to account for it is 
to define the multivariate likelihood-ratio ranking described 
in Appendix lAl Unfortunately, implementing this ranking for 
more that two search methods is not feasible. Also, we argue 
that in most practical situations the net positive effect of corre- 



lations is small. The multivariate likelihood-ratio, Eq. ( lAll ). 
defines the optimal exclusion surfaces, S{C,), for the upper 
limit calculation method of Jit]. These surfaces generally 
are non-linear and, therefore, do not directly correspond to 
any of the choices considered in The closest in spirit is 
the "efficiency-weighted combination" suggested by the au- 
thors of fsl], where contributions from each combination of the 
search methods are weighted proportionally to their sensitiv- 
ity to signals. Using notation of ISii] and accounting for noise 
contribution, the corresponding exclusion planes are defined 
by the normal vector k = {cq/bq), where eq is the probabil- 
ity of a signal to be detected by the q^^ combination of the 
searches and bq is the number of background events in this 
combination. 



III. TESTING THE MAXIMUM LIKELIHOOD-RATIO 
STATISTIC WITH NON-GAUSSIAN DATA 

The maximum likelihood-ratio statistic, Eq. (O, provides a 
natural way to combine results of several search methods into 
a single joint search. It possesses several attractive qualities 
and is expected to result in no loss of efficiency in the most 
practical situations (see Appendix IB] for discussion of this). 
Still, it is important to verify this in conditions that mimic the 
strong non-Gaussian noise that is encountered in the search for 
gravitational waves in real data. To simulate a real life appli- 
cation of our procedure, we employ four search methods that 
are currently used to detect gravitational waves from coalesc- 
ing binaries with the LIGO and Virgo observatories. We ana- 
lyze simulated signals inserted in the data from LIGO's fourth 
scientific run (S4) and combine results of these analyses us- 
ing the maximum likelihood-ratio statistic. We estimate the 
efficiency of the combined search in detecting these signals in 
the typical LIGO noise and compare it to the efficiencies of 
the individual searches. 



A. Data and signals 

We insert simulated signals into data collected between 
February 24 and March 24, 2005, during LIGO's S4 run. The 
data was taken by three detectors: the HI and H2 co-located 
detectors in Hanford, WA and LI in Livingston, LA. Several 
searches for gravitational waves were performed in these data, 
but no gravitational-wave candidates were identified Jgt jTltl . 
For this work, 15 days of triple coincidence data were used, 
which is the sum of all times during S4 when all three detec- 
tors were simultaneously operating in science mode. 

Our signals include three kinds of binaries: neutron star- 
neutron star (BNS), neutron star-black hole (NSBH) and 
black hole-black hole (BBH) binaries. We use non-spinning 
waveforms to model signals from these binaries. For BNS, 
these are post-Newtonian waveforms fl2tj22tl . Newtonian or- 
der in amplitude and second order in phase, calculated using 
the stationary phase approximation Jl3l l20l uM with the up- 
per cut-off frequency set by the Schwarzschild innermost sta- 
ble circular orbit (ISCO). Signals from all other binaries are 
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approximated by the effective one body numerical relativity 
(EOBNR) waveforms HI [HQ. The former waveforms de- 
scribe only the inspiral phase of the coalescence, whereas the 
latter also include merger and ringdown phases. 

The simulated signals are injected into non-overlapping 
2048-second blocks of data. To improve the statistic, multi- 
ple signal populations are inserted in the data and indepen- 
dently analyzed. Signals are split into three categories by total 
mass of the binary: 2-6 il/©, 6-100 Mq, and 100-350 Mq. 
The lowest mass range includes only BNS systems. Within 
each mass range, signals are distributed uniformly in distance 
or the inverse of distance. The BNS range covers 1-20 Mpc 
while other systems reach from 1-200 Mpc. In order to repre- 
sent realistic astrophysical population with probability density 
function scaling as distance squared, the simulated signals are 
appropriately re-weighted and are counted according to their 
weights. All other parameters of the signals have uniform dis- 
tribution. In total, there are 943, 2245, and 2237 signals in- 
jected within each mass category, respectively. 



B. Search methods 

Four search methods, each representing one of the standard 
searches for transient gravitational-wave signals in LIGO and 
Virgo data, are used to perform this joint analysis. Brief de- 
scriptions of the search methods are given below. The first 
three are template-based searches, while the last one does not 
rely on any specific signal model. 

The low-mass CBC pipeline targets binaries with total mass 
below 35 Mq. The data from each interferometer are match- 
filtered with a bank of non-spinning post-Newtonian wave- 
forms lfl2l - l22ll covering binary mass combinations with total 
mass in the range 2-35 Mq. The template waveforms are 
calculated in the frequency domain using the stationary phase 
approximation lfl3l l20l I21I1 to Newtonian order in amplitude 
and second PN order in phase. The waveforms are extended 
up to the Schwarzschild ISCO. When the signal-to-noise ratio 
(SNR) time series for a particular template crosses the thresh- 
old of 5.5, a single-interferometer trigger is recorded. These 
triggers are required to pass waveform consistency and coin- 
cidence tests with triggers from other interferometers. The 
surviving triggers are ordered by a ranking statistic and form 
a ranked set of candidate events. For detailed descriptions of 
this pipeline and recent search results, see lfTll[34l - [37ll . 

The high-mass CBC pipeline is similar to its low-mass 
counterpart, however it is designed to target binaries with total 
mass between 25-100 Mq. The EOBNR family of templates 
used in a high-mass search has waveforms covering the evo- 
lution of a coalescing binary from late inspiral to ringdown. 
Other than the choice of templates, the analysis is quite simi- 
lar to the low-mass search. The high-mass CBC pipeline was 
used to search for gravitational waves from binary black holes 
in the S5LIGO data Q. 

The ringdown pipeline was developed to search for 
gravitational-wave signals corresponding to the post-merger 
phase of the binary coalescence. After two compact objects 
merge, a single, highly perturbed black hole forms and ra- 



diates energy while it settles down to a stable Kerr solution. 
The pipeline constructs its template bank from the dominant, 
I = 2 and m = 2, black hole quasi-normal modes charac- 
terized by a single frequency and quality factor. The template 
bank used in the ringdown search spans the most sensitive part 
of the LIGO frequency band, 50 Hz-2 kHz. Quality factors 
between 2-20 are used, corresponding to a range for the fi- 
nal black hole spin between non-spinning and a ~ 0.994. As 
with the other search methods previously described, candidate 
events are ranked after being detected by multiple interferom- 
eters and passing several consistency tests. This pipeline was 
used to search for gravitational waves in the S4 data |@| . 

Coherent WaveBurst is a gravitational-wave burst anal- 
ysis pipeline designed to detect signals from transient 
gravitational-wave sources. It uses minimal information about 
the signal model, instead using the cross-correlated excess 
power from the gravitational-wave signal across a network of 
interferometers Il39ll . The pipeline enforces the signal hypoth- 
esis by maximizing a likelihood functional that describes the 
expected signal response of an impinging gravitational wave 
given its source location in the sky combined over the network 
of interferometers ll40ll . Triggers are generated from the inter- 
ferometer network by combining time-frequency maps using 
wavelet transformations of the interferometer time-series data. 
From these maps, the likelihood of a trigger is calculated by 
the pipeline from the correlation of the whitened data streams 
weighted by the network's antenna patterns. This pipeline was 
used in the LIGO S4 iH and LIGOA'irgo S5/VSR1 61 gl 
searches for un-modeled short duration transients, as well as 
searches for black hole binaries Ii44il . 

We note that our analysis does not include the most recent 
innovations developed to improve the efficiency of each of 
these pipelines. In particular, we do not categorize the candi- 
date events by the template mass and coincidence type - a nov- 
elty introduced in the low- and high-mass CBC pipelines dur- 
ing the analyses of S5 LIGO data and the S5/VSR1 data from 
LIGO and Virgo. Also, we use the default settings for the S5 
analysis (S4 for the ringdown search) for numerous pipelines' 
parameters without attempting to re-tune them. We choose 
to perform simulations without the most up-to-date and fully 
tuned versions of the pipelines to save time. This is justified 
because our algorithm for combining searches is ignorant of 
the inner-workings of each pipeline. This makes the combined 
search robust against small changes in the individual analysis 
algorithms. For our purpose, it is sufficient to use somewhat 
simplified versions of the pipelines, as we do not expect these 
results to change dramatically when incremental changes oc- 
cur as the pipelines evolve. 



C. Algorithm for combining searches 

The procedure for combining candidate events identified by 
different classifiers is straightforward and based on Eq. (O. 
The first step is the calculation of the likelihood ratio, A(ri), 
defined by Eq. dU, for every event. Notice that since the nu- 
merator depends on the population of signals through p(h 1 1), 
A(ri) is not just a trivial re-scaling of a rank assigned by a 
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classifier to an event. The likelihood ratio estimates the sig- 
nificance of each event in the context of a gravitational-wave 
detection from the targeted population of sources specified via 
p(h 1 1) and p{Ti \ h, 1). As a result, events are ranked by the 
odds of being produced by the classifier in response to the 
targeted signals, rather than noise. Depending on the popula- 
tion of sources, some classifiers may not provide any useful 
information. In that case, events provided by such classifiers 
receive a very low likelihood ratio rank and are effectively re- 
moved from the search. This is a desirable feature that makes 
the procedure robust against nuisance classifiers. 

In order to compute the one-dimensional likelihood ratio 
given in Eq. (|2]i, we need to measure the classifiers' response 
to the gravitational-wave signals interposed over noise and to 
background noise only. For the latter, we use a common back- 
ground estimation technique for gravitational-wave searches 
— shifting recorded data from the non-colocated interferom- 
eters in time with respect to each other ifTH l34l - [37ll . If the 
shift is much longer than the gravitational-wave travel time be- 
tween the detector sites (« 10 ms for a Hanford-Livingston de- 
tector pair), then the resulting time-shifted data are guaranteed 
to contain no coherent gravitational-wave signals. These data, 
when analyzed by the classifier, represent the background of 
the search. In the low-mass CBC, high-mass CBC, and ring- 
down pipelines, we perform 2000 forward-in-time shifts of 
the LI data with time steps of 7 seconds relative to HI and 
H2. The Coherent WaveBurst search uses 100 forward-in- 
time shifts of the LI data with 5-second time steps. Each 
time-shifted data set produces an independent sample of the 
background. The time-shifted data are analyzed and all back- 
ground events are recorded. 

For background events, it is significantly easier to estimate 
the cumulative probability density function (cdf), P{Ti \ 0) = 
J^Pi^i I 0) dr^, than the probability density function (pdf), 
p{ri I 0). Each background data set represents an independent 
trial observation of duration, T. Therefore, for a trigger, r,, 
the ratio of the number of the trial observations that produced 
a trigger with the rank > to the total number of trial ob- 
servations provides an estimate of the probability, P^lri | 0), 
of the classifier producing an equally or higher ranked event 
in the analysis of noise alone. This probabihty is a monotonic 
function of P{ri \ 0) and experiment duration, T, 



PT{T^ I 0) = 1 - (1 - P(r, I 0))^/^« , 



where Tq is the duration of a unit experiment, which can be 
classified by a single rank, r^. The scale for To is set by the 
duration of the gravitational-wave signal, the time scale on 
which data samples can be considered uncorrected. In prac- 
tice, given that all methods analyze the same amount of data, 
the two probabilities /^(ri | 0) and P{Ti \ 0) are equivalent 
for the purpose of ranking the candidate events since one is 
a monotonic function of the other. Computation of the back- 
ground cdf curve, PT(ri | 0), is a trivial task. First, the single 
event with the highest rank from every background data set 
is chosen. Then, for any value of r^, one simply counts the 
number of these events with rank r'^ > r^, divided by the total 
number of background data sets. In this way, the background 
cdf curves are calculated for each search method. 



In order to measure the response of each search method to 
the targeted gravitational-wave signals and calculate the nu- 
merator in Eq. (|2]i, several populations of simulated signals 
are injected into the data and processed by the pipelines. For 
each search method, events identified with the injected sig- 
nals are recorded along with the parameters of the signals. 
As with background events, it is much easier to compute the 
cdf, P{Ti 1 1), of the signals. For an event with rank r^, this 
probability is approximated by the ratio of the number of in- 
jected signals with rank > to the total number of in- 
jected signals. Using this algorithm we compute cdf curves, 
P{Ti I Sj, 1), for each search method, i, and mass category 
of the simulated signals, Sf 2-6 A/©, 6-100 Mq, and 100- 
350 Mq, which represent the intended targets of the low-mass 
CBC, high-mass CBC, and burst/ringdown pipelines, respec- 
tively. 

Having pre-computed the background and signal cdf curves, 
the algorithm for combining the analysis pipelines can be sum- 
marized in the following two steps. First, every event, r^, pro- 
duced by the i*'' classifier is assigned the log-likelihood-ratio 
ranking given by 



i(r, I S,) 



In 



PT(r,|0) 



(5) 



in which the ratio of pdfs in Eq. dU is approximated by the 
ratio of cdfs. For the sake of brevity in what follows we 
omit the "log" from the ranking name and refer to L(ri | Sj) 
as the likelihood-ratio ranking. Second, all events from all 
classifiers are mixed together and clustered, retaining events 
= with the highest likelihood-ratio ranking, L{i-i \ Sj), within 
the specified time window. These events form the final list 
of gravitational-wave candidates. The time window is approx- 
imately equal to the autocorrelation time for an average sig- 
nal injected in the data. In our simulations, we set it to 10 
seconds. Events separated in time by more than 10 seconds 
are uncorrected and therefore may correspond to different sig- 
nals iflTl l34l - [37ll . This last step effectively implements max- 
imization in Eq. ^ over the likelihood ratio for coincident 
events identified by multiple search methods. 

We expect the cdf approximation used in Eq. (|5]l to be fair in 

(4) 

the context of our simulations. Injection and background dis- 
tributions are one-dimensional, monotonic functions of rank. 
They generally fall off as some negative power of rank. De- 
tectable signals lie on the tail of the background distribution. 
Under these conditions, the difference between using the pdf 
or cdf in the likelihood ratio is insignificant. Nevertheless, 
we should stress that this may not be the case in general and 
proper estimation of signal and background probability distri- 
butions may be required. 



Before proceeding to the discussion of simulation results, 
we note that L(ri \ Sj), defined by Eq. (|5]i, depends on the 
population of injected signals, Sj. Therefore, events identi- 
fied by the classifiers in each search must be re-processed ac- 
cording to the algorithm described above for each population 
of sources, Sj. 
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D. Simulation results 



After multiple search methods are used on the data injected 
with gravitational-wave signals, the events selected by each 
search are processed with the algorithm sketched in the previ- 
ous subsection. To estimate the background for the combined 
search, we again perform time shifts of the LI data with re- 
spect to data from HI and H2. Although the time shifts per- 
formed in unci are independent for each classifier, the time 
shifts must be synchronized for all classifiers when estimat- 
ing the background for the combined search. For this purpose, 
100 5-second time shifts of the LI data are performed with 
respect to the HI and H2 data. The background sample is 
processed with the same algorithm as the main data. 

As previously mentioned, we consider three target popula- 
tions of compact binaries, categorized by their total mass: the 
binary neutron stars with total mass 2-6 Mq, the compact bi- 
naries with total mass 6-100 Mq, and the binaries with total 
mass 100-350 A/©. These define three independent searches. 
The data with injected signals from each category are ana- 
lyzed independently. The resulting events are ranked by the 
likelihood ranking, Eq. (|5]l, with Sj being one of the consid- 
ered target populations. The background events are ranked 
and combined in the same way, providing an estimate of the 
background for the combined searches. 

To compare combined searches with the individual search 
methods, we compute their sensitivities to the signals in the 
presence of typical background. We summarize this in Fig- 
ures [lJ{3l which show curves of visible volume versus false 
alarm rate for each of the search methods and for the com- 
bined search. For each point on these curves, the calculation 
proceeds as follows. First, using background events, we de- 
termine the value of the rank corresponding to a given false 
alarm rate. Next, the efficiency as a function of distance to 
the source, e{D), is estimated by the fraction of the signals 
at distance D ranked above that value. This efficiency is then 
converted to the visible volume. 

The binary neutron star search is a case study in which only 
one of the classifiers, namely the low-mass CBC pipeline, is 
effective in detecting the particular type of gravitational-wave 
signal. All other classifiers are designed to detect either black 
hole binaries or short duration bursts and are inefficient in de- 
tecting the long inspiral signal sweeping through the whole 
LIGO frequency band. This is properly accounted for in the 
likelihood-ratio ranking, which is very low for all events from 
these other classifiers. The only events not de-weighted are 
those from the low-mass CBC pipeline. As a result, the com- 
bined search is equivalent to the low-mass CBC search in 
this case. The sensitivity curves for both searches, shown on 
Figure [T] coincide. This shows that our algorithm is robust 
against uninformative, nuisance classifiers. 

The picture changes dramatically for compact binaries in 
the medium mass range, shown in Figure |2] In this case, the 
efficiency of the low-mass CBC pipeline is negligible in com- 
parison to the other classifiers. In this category, the Coherent 
WaveBurst pipeline has the best overall sensitivity. Further in- 
spection reveals that the high-mass CBC pipeline is the most 
sensitive of the three in the 6-50 Mq mass region, whereas 
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FIG. 1. Visible volume versus false alarm rate for binary neutron 
stars. The shaded area around a curve represents its la Poisson error. 
The "Combined" and the "Low Mass CBC" curves coincide, whereas 
all other curves drop to near zero in visible volume. 
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FIG. 2. Visible volume versus false alarm rate for CBC with total 
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the ringdown pipeline, despite being subdominant, tends to 
detect signals with high mass ratio that are either missed or 
not ranked high enough by the other pipelines. Thus, in this 
case, all but one classifier contribute detected signals to the 
combined search (the "x-x" curve on Figure|2l). This is a de- 
sired effect of incorporating the detection sensitivities of dif- 
ferent pipelines, which results in a more sensitive and robust 
combined search. 

We observe similar effects for the high-mass binaries, al- 
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visible volume of the most sensitive pipeline in each mass bin, 
shown in the lower pane of Figure |4] 



10" 10' 
False Alarm Rate (1 / yr) 



FIG. 3. Visible volume versus false alarm rate for CBC with total 
mass 100-350 Mq. The shaded area around a curve represents its 
Icr Poisson error. 



though this is not obvious from Figure [3] The figure shows 
the Coherent WaveBurst pipeline dominating over the ring- 
down or the high-mass CBC pipelines. The sensitivity curve 
of the combined search tends to be just above the Coherent 
WaveBurst curve and occasionally drops below it. However, 
these drops are well within the error bars. The detailed in- 
vestigation shows that the high-mass CBC and the ringdown 
pipelines still contribute detections of extra signals missed by 
the Coherent WaveBurst pipeline. In particular, the ringdown 
pipeline has the highest sensitivity of all searches in the 270- 
350 Mq region. Most of these extra signals are in the near or 
mid range zone (less then 100 Mpc) and therefore do not con- 
tribute as much to the total visible volume as those at far dis- 
tances. As a result, overall gain for the combined search is not 
that significant when compared to the Coherent WaveBurst 
search alone. Moreover, occasionally, due to background fluc- 
tuations, the threshold for the combined search fluctuates up- 
ward, which results in a loss of a few distant signals detected 
by the Coherent WaveBurst pipeline. The loss of visible vol- 
ume associated with these signals is not compensated by the 
gain of efficiency in the mid range. We should note that the 
measurement of detection efficiency for signals beyond 150 
Mpc has large uncertainties due to low counts for detected sig- 
nals. Therefore, the actual loss of efficiency in this case may 
be overestimated. 

For demonstration of these effects and further insight, we 
plot cumulative 50% efficiency contours on the distance/total 
mass plane for signals to be detected above the threshold, Fig- 
ure|4] The threshold is set by the lowest measured false-alarm 
rate of 0.28 events per year (corresponding to the left most 
point on the sensitivity curves in Figures [TJ-O. In Figure |4] 
the contour for the combined search envelops contours of the 
other pipelines. Furthermore, we calculate the corresponding 
visible volume for the joined search and plot its ratio to the 
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FIG. 4. The upper pane shows the cumulative 50% efficiency con- 
tours at false-alarm rate of 0.28 events per year. The combined search 
contour envelops the single classifier contours from above. The lower 
pane shows the plot of the ratio of the visible volume of the combined 
search to the visible volume of the most sensitive classifier in each 
mass bin. Having a ratio greater than one indicates that the combined 
search gains sensitivity across the entire mass range. 



IV. CONCLUSIONS 

We consider the problem of combining outputs of partially 
redundant search methods analyzing the same data in the con- 
text of gravitational wave searches. We suggest that the likeli- 
hood ratio, Eq. (|2|i, provides a natural unified ranking for the 
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candidate events identified by the search methods. It has a 
straightforward interpretation — events from each method are 
ranked according to the ratio of the method's sensitivity to its 
background. After forming the joined list of candidate events, 
calculation of the posterior probability distribution or an up- 
per Umit on the rate of gravitational-wave emissions can pro- 
ceed exactly as it would for a single search method. Therefore, 
there is never a problem consistently accounting for multiple 
trials in the analysis. If the combined search is interpreted as a 
counting experiment, the procedure for calculating the upper 
limit from multiple searches is similar to that suggested in 
In that case, classifiers contribute in proportion to their sen- 
sitivities. Additionally, our method accounts for information 
about the classifier's background, which is important when 
dealing with experimental data containing non-Gaussian noise 
artifacts. 

We test our procedure by simulating a search for gravi- 
tational waves from compact binary coalescence in the data 
from LIGO's S4 science run. We combine outputs from four 
search methods — the low-mass CBC, high-mass CBC, ring- 
down, and Coherent WaveBurst analysis pipelines — analyz- 
ing data with injected gravitational-wave signals from com- 
pact coalescing binaries in a wide range of masses. We 
find that our algorithm is robust against nuisance pipelines 
— those that are not sensitive to the targeted gravitational- 
wave sources. Moreover, the combined search proves to have 
greater or comparable sensitivity to any individual pipeline. 
In the simulations, we observe that the pipelines we use con- 
tribute different events to the total count of detected signals, 
thus increasing robustness and the overall probability of de- 
tecting gravitational waves from coalescing binaries. This ef- 
fect is especially pronounced for sources in near to mid range 
distances. Overall, our simulations show that searches for 
gravitational waves from coalescing binaries can benefit from 
combining results of multiple analysis methods by means of 
the likelihood-ratio statistic. 
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Appendix A: Multivariate likelihood-ratio statistic 

The maximum likelihood-ratio procedure described in Sec- 
tion|II]does not account for potentially useful information con- 



tained in the correlations between classifiers. In this section, 
we derive the formal expression for the multivariate statistic 
that is optimal by construction and includes all available infor- 
mation. We discuss some of its properties and its relation to 
the maximum likelihood-ratio statistic, Eq. (O. 

In the absence of internal thresholds, each classifier assigns 
a rank, r^, to every data sample. In this case, the vector of 
ranks, r = (ri, r2, . . . r„), can be interpreted as the reduced 
experimental data, and the problem of combining searches 
is analogous to a detection problem. The general problem 
of detection in the presence of arbitrary noise was discussed 
in detail in iQ]. Here, we state the final result and refer the 
interested reader to 101 for derivation and further discussion. 
The optimal solution, assuming the Neyman-Pearson criteria 
that requires the maximization of the signal detection proba- 
bility at a fixed rate of false alarms, ranks data samples by the 
likelihood-ratio detection statistic. For n classifiers, this takes 
the form 



A(ri,r2, . . . ,r„) = 



/p(ri,r2,...,r„|h,l)p(h|l)dh 



P(ri,r2, 



|0) 



(Al) 

where h stands for a gravitational-wave signal, 
p(ri, 1-2, . . . , r„ I h, 1) is the probability distribution for 
the vector of detection statistics (ri,r2, . . . ,r„) in the case 
when the gravitational-wave signal h is present in the data, 
p{ri , 12, . . . , r„ I 0) is the analogous distribution for the noise, 
and p{h \ 1) is the distribution of signal parameters for the 
targeted population of gravitational-wave sources. 

The joint likelihood ratio, Eq. (lAlb . includes the output 
from all classifiers and by construction provides the optimal 
ranking. We can simplify the expression in Eq. dAlb by not- 
ing that for n = 2, 



A(ri,i- 



/p(ri,r2|h,l)p(h|l)dh 
1-2 10) 

_l/ /p(ri|r2,h,lMh|l)A(r2,h)dh 
^\ P(ri|r2,0) 

/p(r2|ri,h,l)p(h|l)A(ri,h)dh \ 
P(r2|ri,0) J' 
(A2) 

and extending linearly for n > 2. 

In practice, computing the conditional probabilities in 
Eq. ( IA2l i can be a nontrivial task. Moreover, as the num- 
ber of classifiers increases, computing the necessary condi- 
tional probabihties becomes a less and less viable option 
and it is necessary to develop an approximation. The max- 
imum likelihood-ratio procedure can be regarded as such 
even though it does not follow from the multivariate expres- 
sion dAlb in a straightforward way. In order to find a re- 
lation between the two, it is useful to consider the limiting 
cases of Eq. (IA2I) . First, assume there is no correlation be- 
tween the measurements made by each classifier In that case, 
p{ri I Tj, h, 1) = p{Ti I h, 1) (and similarly for the denomina- 
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tors), so that 



A(ri, ra) = J A{ii,h)A{T2,h)p{h \ 1) dh ; 



A(ri)A(r2) . 



(A3) 

Factorization in the last step is justified because only one of 
the classifiers exhibits a non-trivial response in the presence 
of a signal in the data. As a consequence, the un-marginalized 
likelihood ratio, A(r, h), for that classifier is a function of the 
signal, h, sharply peaked around the true parameters of the 
signal, whereas the other likelihood ratio is almost constant. 
At the other extreme, consider the case of two strongly corre- 
lated classifiers. Then 

/p(ri, 12 I h)p{h I 1) dh _ S{y, - r2) Jp{i-2 I h)p(h | 1) dh 



P(ri,r2 I 0) 



S{ti - r2)p(r2 I 0) 



A(ri). 



(A4) 



Both cases can be easily generalized for n > 2 classifiers. 

When classifiers are strongly correlated, the maximum like- 
lihood ratio is trivially equivalent to iA4\ . In the opposite 
case, the absence of correlation between the classifiers implies 
their complementarity. If one classifier identifies a significant 
event, the others do not. This means that typically only one of 
the likelihood ratios in the product in Eq. iA3\ will be signif- 
icantly different from unity. Therefore, in this case, picking 
the maximum of the single classifier likelihood ratios or cal- 
culating their product has similar effect. Based only on these 
extreme situations, it is difficult to determine how good of an 
approximation the maximum likelihood ratio is in the interme- 
diate case. Nevertheless, we conjecture that the truly useful 
information can only be in correlations between the classifiers 
using incomplete, but complementary, information about the 
signal (e.g. template-based searches using inspiral and merger 
or ringdown waveforms). Even in this situation, the inclusion 
of correlations should be a next-order effect. 



Appendix B: Maximum likeliliood-ratio statistic 

One can gain further insight into the statistic defined by 
Eq. (O by mapping the ranks, r^, to their likelihood ratios, 
Ai(ri). The mapping is defined by Eq. (|2]i. The data space of 
the combined search is A = (Ai, A2, . . . ,An). For the i^^ 
classifier, the probabilities of detection and of false alarm are 
given by 



Pl = Je (A, - A*) p{A I l)p(l) dA (Bl) 

P^= Je (A, - A:)p{A I O)p(O) dA . (B2) 
and for the combined search by 

Fi = / e ('max(A) - A;!)p(A| l)p(l)dA (B3) 



where A* and A* are detection thresholds determined from 
the threshold value for the false alarm probability, Pq, which 
is the same for all classifiers. 

The efficiency of the combined search is expected to be, at 
the very least, no less than the efficiency of any of the clas- 
sifiers being used (Pi > P^). This is a necessary condition 
for the maximum likelihood-ratio procedure to be applicable. 
To get a better understanding of what this condition implies 
and when it is expected to hold, consider the simple case of 
combining a pair of classifiers. This can be generalized in 
a straightforward way to arbitrary number The data space, 
in this case, is a positive quarter in the (Ai, A2) space. The 
lines of constant likelihood ratio. A;, are horizontal or verti- 
cal lines. The lines of constant joint likelihood ratio, given 
by Eq. (lAlb . can be complicated curves even in the (Ai, A2) 
plane and define the optimal detection surfaces. The corre- 
sponding surfaces for the maximum likelihood-ratio statistic 
form a square, centered at the origin, with sides parallel to 
the Ai and A2 axes. This configuration is visualized on Fig- 
ure |5] where Aj (vertical dashed line), Aj (horizontal dashed 
line), and A* (dotted line) are the thresholds corresponding 
to a particular value of the probability of false alarm, for sin- 
gle and combined searches respectively. Detection regions for 
each classifier consist of all points for which the argument of 
the theta function in the expressions for detection and false 
alarm probabilities, Eqs. (IB 11 1 and (IB2b . is positive. The de- 
tection region for the i^^ classifier is defined by the condition 
Ai > A*. All data points in the plane satisfying this condition 
are counted as detection of a signal. The detection region for 
the combined search defined by Eqs. ( |B3t and ( IB4l i consists of 
the points satisfying two conditions: Ai > A* and A2 > A*. 
Recall that the false alarm probability for all searches is the 
same, which implies that 



p(Ai,A2|0) dA 



p(Ai,A2|0)dA, (B5) 



where Vi and Vc denote the detection regions for either of the 
individual searches and for the combined search respectively. 
This implies that A* > A^ — the threshold for the combined 
search is higher than the threshold for any of the individual 
pipelines. Indeed, if it was not true, then Eq. ( IB5b could not 
be satisfied since Vi C Vc- This would correspond to mov- 
ing the vertical dashed line to the right of the solid square in 
Figure |5] as an example for classifier Ai. Thus, the diagram 
shown in Figure |5] represents the only allowed configuration. 
Continuing with the classifier Ai, one can identify the set of 
points gained by the combined search, V+, which are points 
not included in Vi, and the set of points lost, V-, those be- 
longing to Vi but not to Vc- Both sets are shown in Figure|5]as 
shaded regions. It is clear that the efficiency of the combined 
search will be greater or equal to the efficiency of the classifier 
Ai if and only if 



p(Ai,A2|l)dA> / p(Ai,A2|l)dA. (B6) 

v+ Jv- 



Po = I O (^max(A) - A* ) p{A \ O)p(O) dA , (B4) 



Note that at the same time 



p(Ai,A2|0)dA= / p(Ai,A2|0)dA, (B7) 
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FIG. 5. The detection thresholds can be visualized for individual and 
combined searches in the (Ai, A2) space. For individual searches, 
the detection threshold appears as the vertical dashed line (Ai) and 
horizontal dashed line (A2). The A* threshold is the dotted line. The 
shaded regions represent the data points gained, V+ , and lost, V- , by 
the combined search when referencing a search performed with Ai. 



by virtue of Eq. ( IB5l l. Thus, Eq. ( IB6I 1 states that the joint like- 
lihood for the points in V+ must be (on average) greater than 
or equal to the joint likelihood for the points in V-. For this 
case, exchanging V+ for V- results in a positive gain. It is not 
unjustified to expect that Eq. ( IB6I ) would be satisfied in most 
practical situations. After all, according to the classifier A2, 
points in V+ have a better chance of being a signal than those 
in because A2 for any point in V+ is greater than Ai for 
any point in In effect, when combining searches using 
the maximum likelihood-ratio approximation, one exchanges 
points from V- with decent Ai and low A2 in favor of points 
in V+ with low Ai but high A2. Consider an extreme case 
when the classifier A2 is not informative. The probability of 
getting a high likelihood ratio in the absence of the signal is 
very low. Then, the total probability jy^ p(Ai, A2 | 0) is neg- 
ligible, effectively making V- a null set. This implies robust- 
ness of the approximation against nuisance, non-informative 
classifiers. The above steps can be mirrored for the classifier 
A2, resulting in the same conclusions. 

In conclusion, we should stress that, although condi- 
tion ( IB6I 1 does not hold in general, it is expected to be satisfied 
when combining well designed classifiers that are sufficiently 
different to be able to complement each other's detection effi- 
ciencies. These are the typical cases that arise in practice. 
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